```{r}
#| eval: false
head(mtcars)
```
Gender Equity in College Sports
Massive Data Fundamentals final project
Policy context
Equity in Athletics Disclosure Act (EADA)
Data
Research questions
Data pipeline
Analyses
Exploratory
Unsupervised
Institution Level
Principal Component Analysis (PCA) combined with K-means clustering lets practitioners identify universities that resemble each other in sports equity. The plot below shows five clusters, which appear to separate largely by school size. On the far right of the graph are the large universities, with higher spending and larger teams. These universities skew the graph somewhat, because they spend so much more on their sports teams than the many small schools that sit mostly on the left side.
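The PCA and K-means step described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the project's actual pipeline: the data frame `schools` and its column names are hypothetical, the values are synthetic, and the log transform is one common way to soften the skew from very large programs rather than something the project is known to have done.

```{r}
# Synthetic stand-in for institution-level features (hypothetical columns,
# made-up values; not the project's data).
set.seed(42)
n <- 200
enrollment <- round(exp(rnorm(n, mean = 8.5, sd = 1)))
schools <- data.frame(
  enrollment         = enrollment,
  total_expenses     = enrollment * runif(n, 500, 3000),
  women_exp_share    = runif(n, 0.30, 0.55),
  women_partic_share = runif(n, 0.35, 0.60)
)

# Assumption: log-transform the right-skewed size and spending columns so a
# few very large programs do not dominate the components.
features <- transform(
  schools,
  enrollment     = log(enrollment),
  total_expenses = log(total_expenses)
)

# PCA on centered, unit-variance features.
pca <- prcomp(features, center = TRUE, scale. = TRUE)

# K-means with five clusters on the first two principal components.
scores <- pca$x[, 1:2]
km <- kmeans(scores, centers = 5, nstart = 25)

summary(pca)$importance[2, 1:2]  # proportion of variance in PC1 and PC2
table(km$cluster)                # number of schools in each cluster
```

Clustering on the first two components keeps the clusters consistent with a two-dimensional plot. Clustering on more components, or on the scaled features directly, is an equally reasonable choice.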
A school administrator may also want to use this clustering to find universities similar to their own. The next two examples show what administrators at schools like Georgetown or Furman might see when they enter their own university and look up its nearest neighbors. Some results are expected, but others are less so, and those give schools a reason to connect about gender equity in sports and to learn from one another's programs. For example, Georgetown sits close to many schools in its conference, but schools like East Carolina and Old Dominion are also nearby and would be less obvious schools to connect with.
For Furman, the nearest schools are much less intuitive: most are on the West Coast, far from the South Carolina school. These connections could help facilitate discussions on how gender equity can be improved across regions.
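A nearest-neighbor lookup of this kind can be sketched as a Euclidean distance search in principal component space. The helper `nearest_schools()` and the placeholder school names and coordinates below are hypothetical, chosen only to make the example self-contained. They are not results from the project, and the project may use a different distance or a different number of components.

```{r}
# Hypothetical PC scores, one row per school (made-up values).
scores <- rbind(
  "School A" = c(0, 0),
  "School B" = c(1, 0),
  "School C" = c(0, 2),
  "School D" = c(5, 5),
  "School E" = c(-0.5, 0)
)

# Return the k schools closest to `school` by Euclidean distance.
nearest_schools <- function(scores, school, k = 3) {
  stopifnot(school %in% rownames(scores))
  target <- scores[school, ]
  # Subtract the target's coordinates from every row, then take row norms.
  d <- sqrt(rowSums(sweep(scores, 2, target)^2))
  d <- d[names(d) != school]  # drop the school itself
  head(sort(d), k)
}

nn <- nearest_schools(scores, "School A", k = 2)
nn  # School E at distance 0.5, then School B at distance 1
```

In practice the row names would be institution names and the matrix would hold the PCA scores, so an administrator could pass in their own school and read off its closest peers.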